---
title: How to detect and respond to zombie flows
sidebarTitle: Detect Zombie Flows
description: Learn how to detect and respond to zombie flows.
---

Sudden infrastructure failures (like machine crashes or container evictions) can cause flow runs to become unresponsive and appear stuck in a `Running` state.

To mitigate this, flow runs triggered by deployments can emit heartbeats to drive Automations that detect and respond to these "zombie" flow runs, ensuring they are marked as `Crashed` if they stop reporting heartbeats.


### Enable flow run heartbeat events

You will need to ensure you're running Prefect version 3.1.8 or greater and set `PREFECT_RUNNER_HEARTBEAT_FREQUENCY`
to an integer greater than 30 to emit flow run heartbeat events.


### Create the automation
To create an automation that marks zombie flow runs as crashed, run this script:
```python
from datetime import timedelta

from prefect.automations import Automation
from prefect.client.schemas.objects import StateType
from prefect.events.actions import ChangeFlowRunState
from prefect.events.schemas.automations import EventTrigger, Posture
from prefect.events.schemas.events import ResourceSpecification


my_automation = Automation(
    name="Crash zombie flows",
    trigger=EventTrigger(
        after={"prefect.flow-run.heartbeat"},
        expect={
            "prefect.flow-run.heartbeat",
            "prefect.flow-run.Completed",
            "prefect.flow-run.Failed",
            "prefect.flow-run.Cancelled",
            "prefect.flow-run.Crashed",
        },
        match=ResourceSpecification({"prefect.resource.id": ["prefect.flow-run.*"]}),
        for_each={"prefect.resource.id"},
        posture=Posture.Proactive,
        threshold=1,
        within=timedelta(seconds=90),
    ),
    actions=[
        ChangeFlowRunState(
            state=StateType.CRASHED,
            message="Flow run marked as crashed due to missing heartbeats.",
        )
    ],
)

if __name__ == "__main__":
    my_automation.create()
```

The trigger definition says that `after` each heartbeat event for a flow run we `expect` to see another heartbeat event or a
terminal state event for that same flow run `within` 90 seconds of a heartbeat event.


### Adjusting behavior with settings

If `PREFECT_RUNNER_HEARTBEAT_FREQUENCY` is set to `30`, the automation will trigger only after 3 heartbeats have been missed.
You can adjust `within` in the trigger definition and `PREFECT_RUNNER_HEARTBEAT_FREQUENCY` to change how quickly the automation
will fire after the server stops receiving flow run heartbeats.

You can also add additional actions to your automation to send a notification when zombie runs are detected.